Design Patterns for Fault Containment
Abstract
Fault containment is an important constituent of fault tolerance. Fault containment mechanisms allow a system to limit the impact of manifested faults to predefined system boundaries. This document presents some of the best-known techniques for fault containment, formatted as design patterns. These patterns are drawn from the areas of self-stabilization, specification closure, and fault-tolerant OS kernels. The presented fault containment patterns are: the Input Guard pattern, which confines an error outside the guarded system boundaries; the Output Guard pattern, which confines an error inside the guarded system boundaries; and the Fault Container pattern, which is the fault-tolerant counterpart of the well-known Adapter pattern and combines the properties of the Input Guard and Output Guard patterns.
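The three patterns named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the patterns' actual structure, so everything here is an assumption made for illustration: the names `ContainmentError`, `input_guard`, `output_guard`, and `fault_container`, the use of boolean predicates as guards, and the toy `scale` component are all hypothetical.

```python
from typing import Callable, TypeVar

A = TypeVar("A")
B = TypeVar("B")


class ContainmentError(Exception):
    """Raised when a guard stops a value from crossing the boundary (hypothetical name)."""


def input_guard(check: Callable[[A], bool],
                component: Callable[[A], B]) -> Callable[[A], B]:
    """Keep erroneous inputs OUTSIDE the guarded component."""
    def guarded(x: A) -> B:
        if not check(x):
            raise ContainmentError(f"input rejected: {x!r}")
        return component(x)
    return guarded


def output_guard(check: Callable[[B], bool],
                 component: Callable[[A], B]) -> Callable[[A], B]:
    """Keep erroneous outputs INSIDE the guarded component."""
    def guarded(x: A) -> B:
        y = component(x)
        if not check(y):
            raise ContainmentError(f"output rejected: {y!r}")
        return y
    return guarded


def fault_container(in_check: Callable[[A], bool],
                    out_check: Callable[[B], bool],
                    component: Callable[[A], B]) -> Callable[[A], B]:
    """Combine both guards around one component, in the manner of an adapter."""
    return input_guard(in_check, output_guard(out_check, component))


# Toy component (assumption): scales a reading by 10.
def scale(x: int) -> int:
    return x * 10


contained = fault_container(
    lambda x: 0 <= x <= 100,   # assumed input specification
    lambda y: y <= 500,        # assumed output specification
    scale,
)

print(contained(5))  # 50
```

In this sketch, `contained(-1)` is rejected at the input boundary before `scale` runs, while `contained(80)` passes the input check but is rejected at the output boundary because 800 exceeds the assumed output limit. Raising an exception is only one possible reaction; a real design might substitute a safe default or trigger recovery instead.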
Similar resources
Error Containment in the Presence of Metastability
Error containment is an important concept in fault-tolerant system design, and techniques like voting are applied to mask erroneous outputs, thus preventing their propagation. In this presentation we will use the example of DARTS, a fault-tolerant distributed clock generation scheme in hardware, to demonstrate that metastability is a substantial threat to error containment. We will illustrate h...
A Dissertation Submitted to the Department of Computer Science and the Committee on Graduate Studies of Stanford University in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy
Current shared-memory multiprocessors suffer from an inherent fragility, since a single hardware or system software failure can cause the entire machine to crash. This dissertation describes a combination of hardware and software techniques that can be used to provide fault containment for large-scale shared memory machines. With fault containment, the impact of a fault remains limited to only ...
Low-Cost Error Containment and Recovery for Onboard Guarded Software Upgrading and Beyond
Message-driven confidence-driven (MDCD) error containment and recovery, a low-cost approach to mitigating the effect of software design faults in distributed embedded systems, is developed for onboard guarded software upgrading for deep-space missions. In this paper, we first describe and verify the MDCD algorithms in which we introduce the notion of "confidence-driven" to complement the "comm...
Fault-Containment in Self-Stabilizing Distributed Systems
Self-stabilizing systems can automatically recover from arbitrary transient faults, and changes in the environment of the system, without any external intervention. However, in existing distributed self-stabilizing protocols, the performance of recovery is not linked to the severity of the fault. Recovery from failure at even a single component of the system may take a long time and affect the o...
The Xilinx Isolation Design Flow for Fault-Tolerant Systems
The ability to control system failure modes through fault-tolerant design requires an implem...